feat: add VolumeAttachment wait after drain using cached client #207

Merged: Trojan295 merged 18 commits into main from wait-for-volume-attachments-in-drain on Jan 15, 2026
Conversation

@Trojan295 Trojan295 (Contributor) commented Jan 8, 2026

Adds an optional feature to wait for VolumeAttachments to be deleted after node drain, preventing CSI Multi-Attach errors.

Key Changes:

  • New VolumeDetachmentWaiter - waits for VAs to be deleted using informer cache with node name index
  • Per-action control via API - waitForVolumeDetach (bool) and volumeDetachTimeoutSeconds (int) fields on drain action
  • Uses informer manager and adds VolumeAttachment informer with index on spec.nodeName

New Environment Variables:

| Variable | Default | Description |
|---|---|---|
| `DRAIN_DISABLE_VOLUME_DETACH_WAIT` | `false` | Forcefully disables the VolumeAttachment wait |
| `DRAIN_VOLUME_DETACH_TIMEOUT` | `60s` | Default VolumeAttachment wait timeout |
| `INFORMER_CACHE_SYNC_TIMEOUT` | `1m` | Informer sync timeout |

Things to Remember

  1. Timeout is non-fatal - logs warning and proceeds with node deletion
  2. Excludes DaemonSet/static pod VAs - avoids deadlock waiting for non-evictable pods
  3. The API controls the feature, but it can be force-disabled via the environment variable

@Trojan295 Trojan295 force-pushed the wait-for-volume-attachments-in-drain branch from 31bfc06 to 232802c on January 13, 2026 08:58
@Trojan295 Trojan295 marked this pull request as ready for review January 13, 2026 15:10
@Trojan295 Trojan295 requested a review from a team as a code owner January 13, 2026 15:10
| `API_KEY` | CAST AI API key (required) | - |
| `API_URL` | CAST AI API URL (required) | - |
| `CLUSTER_ID` | CAST AI cluster ID (required) | - |
| `DRAIN_VOLUME_DETACH_TIMEOUT` | Default timeout for waiting for VolumeAttachments to detach during node drain | `60s` |
Contributor

Is this a good idea? Feel free to ignore if it's not.

Suggested change
| `DRAIN_VOLUME_DETACH_TIMEOUT` | Default timeout for waiting for VolumeAttachments to detach during node drain | `60s` |
| `FALLBACK_DRAIN_VOLUME_DETACH_TIMEOUT` | Default timeout for waiting for VolumeAttachments to detach during node drain | `60s` |

Contributor Author

Hmm... how about DRAIN_VOLUME_DETACH_DEFAULT_TIMEOUT? That makes it clearer to the user that it's a default value, but one that can be overridden by something (the API payload in this case).

Comment on lines +19 to +20
// May be nil if informer sync failed.
VAWaiter VolumeDetachmentWaiter
Contributor

Do you think the config should instead contain only static configuration and handler objects should be passed separately?

Contributor Author

Yeah, makes sense, given that Kubernetes clients are already passed separately.

r.Equal(60*time.Second, h.getVolumeDetachTimeout(req))
})

t.Run("returns per-action timeout when set", func(t *testing.T) {
Contributor

These tests seem repetitive; could they be rewritten as table tests?

Contributor

@Tsonov Tsonov left a comment

Overall LGTM, mostly polishing questions

@Trojan295 Trojan295 force-pushed the wait-for-volume-attachments-in-drain branch from b58ae66 to 1a244a6 on January 14, 2026 13:24
Add optional feature to wait for VolumeAttachments to be deleted after
draining a node, preventing Multi-Attach errors when CSI drivers need
time to clean up volumes before the node is terminated.

Uses controller-runtime's cached client with field indexes for efficient
VolumeAttachment and Pod queries without hitting the API server directly.
Follows Karpenter's approach of excluding VAs from non-drainable pods
(DaemonSets, static pods) to avoid deadlocks.

Key changes:
- Add Drain config section with WaitForVolumeDetach, VolumeDetachTimeout,
  and CacheSyncTimeout settings
- Create controller-runtime cache with field indexes for VolumeAttachment
  and Pod resources by spec.nodeName
- Implement getVolumeAttachmentsForNode() using Karpenter-style filtering
- Graceful degradation: if cache sync fails, feature is disabled but
  drain operations continue normally

Environment variables:
- DRAIN_WAIT_FOR_VOLUME_DETACH: Enable feature (default: false)
- DRAIN_VOLUME_DETACH_TIMEOUT: Max wait time (default: 60s)
- CACHE_SYNC_TIMEOUT: Cache sync timeout (default: 120s)
…tach

The spec.nodeName field selector is not supported by the Kubernetes API
server for VolumeAttachment resources. This caused the waitForVolumeDetach
function to fail silently.

Switch to using the controller-runtime cached client which has a custom
field indexer configured for spec.nodeName lookups on VolumeAttachments.
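Because the API server rejects `spec.nodeName` field selectors for VolumeAttachments, lookups have to be served from a local index built over the cache. A toy version of such an index, using a plain map rather than controller-runtime's actual indexer, could look like:

```go
package main

import (
	"fmt"
	"sync"
)

// nodeIndex is a toy stand-in for a cache index keyed on spec.nodeName.
type nodeIndex struct {
	mu     sync.RWMutex
	byNode map[string]map[string]struct{} // node name -> set of VA names
}

func newNodeIndex() *nodeIndex {
	return &nodeIndex{byNode: map[string]map[string]struct{}{}}
}

// add records a VolumeAttachment under its node (called on informer Add).
func (i *nodeIndex) add(node, va string) {
	i.mu.Lock()
	defer i.mu.Unlock()
	if i.byNode[node] == nil {
		i.byNode[node] = map[string]struct{}{}
	}
	i.byNode[node][va] = struct{}{}
}

// remove drops a VolumeAttachment (called on informer Delete).
func (i *nodeIndex) remove(node, va string) {
	i.mu.Lock()
	defer i.mu.Unlock()
	delete(i.byNode[node], va)
}

// list returns the VA names still attached to a node.
func (i *nodeIndex) list(node string) []string {
	i.mu.RLock()
	defer i.mu.RUnlock()
	out := make([]string, 0, len(i.byNode[node]))
	for va := range i.byNode[node] {
		out = append(out, va)
	}
	return out
}

func main() {
	idx := newNodeIndex()
	idx.add("node-1", "va-1")
	idx.add("node-1", "va-2")
	idx.remove("node-1", "va-1")
	fmt.Println(len(idx.list("node-1"))) // 1
}
```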
…onfig

- Fix typos: EnablePodInfomer -> EnablePodInformer, EnableNodeInfomer -> EnableNodeInformer
- Add nil checks in addIndexers and reportCacheSize for disabled informers
- Fix Start() to sync node/pod informers independently
- Fix VA getter logic to check vaAvailable flag instead of nil
- Add DisableWaitForVolumeDetach config option
- Update run.go to use new informer options
- Update tests to enable informers explicitly
…table-driven tests

- Move VolumeDetachmentWaiter from internal/actions to internal/volume package
- Rename to DetachmentWaiter for cleaner API within volume package
- Update drain handler and config to use new package location
- Convert TestShouldWaitForVolumeDetach and TestGetVolumeDetachTimeout to table-driven tests
- Rename config field VolumeDetachTimeout to VolumeDetachDefaultTimeout for clarity
Add explicit SelfSubjectAccessReview check before starting the VA informer
to verify get/list/watch permissions on volumeattachments.storage.k8s.io.

Benefits:
- Fast feedback (~50ms) vs waiting for 30s sync timeout
- Clear error messages indicating which specific permission is missing
- Graceful degradation - VA feature disabled if permissions not granted

The check uses the standard authorization.k8s.io/v1 SelfSubjectAccessReview
API which any authenticated user can call to check their own permissions.
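The check's control flow can be sketched generically; here `can` is a hypothetical stand-in for issuing one SelfSubjectAccessReview per verb (the real code would call the authorization.k8s.io/v1 API via client-go).

```go
package main

import (
	"fmt"
	"strings"
)

// can is a stand-in for a SelfSubjectAccessReview call for one verb.
type can func(verb string) (bool, error)

// checkVAPermissions verifies get/list/watch on volumeattachments and
// reports which verbs are missing so the error message is actionable.
func checkVAPermissions(check can) error {
	var missing []string
	for _, verb := range []string{"get", "list", "watch"} {
		ok, err := check(verb)
		if err != nil {
			return fmt.Errorf("permission check failed: %w", err)
		}
		if !ok {
			missing = append(missing, verb)
		}
	}
	if len(missing) > 0 {
		return fmt.Errorf("missing verbs on volumeattachments.storage.k8s.io: %s", strings.Join(missing, ", "))
	}
	return nil
}

func main() {
	// Fake reviewer that only allows get and list.
	allowed := map[string]bool{"get": true, "list": true}
	err := checkVAPermissions(func(v string) (bool, error) { return allowed[v], nil })
	fmt.Println(err) // names "watch" as the missing verb
}
```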
@Trojan295 Trojan295 force-pushed the wait-for-volume-attachments-in-drain branch from ad83b3f to a777d17 on January 14, 2026 16:33
log logrus.FieldLogger,
clientset kubernetes.Interface,
castNamespace string,
volumeDetachTimeout time.Duration,
Contributor

wouldn't it make more sense to move the default timeout to the waiter instead? Then the following logic wouldn't be needed in the waiter constructor:

	if timeout == 0 {
		timeout = DefaultVolumeDetachTimeout
	}

and this drain handler wouldn't need to know about the default which should be a bit simpler logic

Contributor Author

That also makes sense. My reasoning was that we could reuse this somewhere else where we'd want a different default timeout, but I don't see any objection to a default timeout on the waiter.

vaIndexer cache.Indexer,
pollInterval time.Duration,
) DetachmentWaiter {
if vaIndexer == nil {
Contributor

The caller can already check for such a condition; there's no need to hide that complexity at lower levels. Constructors should return something useful when called, or return an error if something is missing. That helps avoid silent failures.
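The error-returning constructor the reviewer describes could look like this (a sketch; `Indexer` here is a local placeholder, not client-go's `cache.Indexer`):

```go
package main

import (
	"errors"
	"fmt"
)

// Indexer is a placeholder for the cache indexer dependency.
type Indexer interface {
	ByIndex(indexName, key string) ([]interface{}, error)
}

type DetachmentWaiter struct {
	idx Indexer
}

// NewDetachmentWaiter fails loudly instead of silently degrading when a
// required dependency is missing, as the review suggests.
func NewDetachmentWaiter(idx Indexer) (*DetachmentWaiter, error) {
	if idx == nil {
		return nil, errors.New("volume attachment indexer is required")
	}
	return &DetachmentWaiter{idx: idx}, nil
}

func main() {
	_, err := NewDetachmentWaiter(nil)
	fmt.Println("constructor error:", err != nil) // the caller decides how to degrade
}
```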

// Returns (true, nil) if all permissions are available.
// Returns (false, nil) if any permission is missing (logs which ones).
// Returns (false, error) if the permission check itself fails.
func (m *Manager) checkVAPermissions(ctx context.Context) (bool, error) {
Contributor

Wouldn't it be enough to just return an error for the generic-error case, a noPermissions-typed error, or no error when everything is good?

@Trojan295 Trojan295 merged commit 1619850 into main Jan 15, 2026
3 checks passed
@Trojan295 Trojan295 deleted the wait-for-volume-attachments-in-drain branch January 15, 2026 14:25